Self-evolving evaluation benchmarks research Internship
Cambridge
AstraZeneca is a global, science-led biopharmaceutical business and its innovative medicines are used by millions of patients worldwide! AstraZeneca Summer Internships introduce you to the world of ground-breaking drug development, embedding you in highly dedicated teams committed to delivering life-changing medicines to patients. Our 10–12-week program is designed for undergraduate, master's, and doctoral students. We offer exciting opportunities across Research & Development, Operations, and Enabling Units (Corporate functions).
Our internships immerse students in the pharmaceutical industry, allowing the opportunity to contribute to our diverse pipeline of medicines whether in the lab or outside of it. You will feel trusted and empowered to take on new challenges, but with all the help and guidance you need to succeed. This internship will help you develop essential skills, expand your knowledge, and build a network that will set you up for future success. You will be surrounded by curious, passionate, and open-minded professionals eager to learn and follow the science, fostering your growth in a truly collaborative and global team.
Introduction to role
Join us at the Center for Artificial Intelligence (CAI), where we design next‑generation evaluation methods for advanced agentic AI systems used across scientific workflows. In this role, you will contribute to a research project focused on developing self‑evolving benchmarking frameworks, where evaluation criteria continuously adapt based on model behaviour, evidence quality, and observed failure modes. You will explore how dynamic criteria, evidence‑grounded scoring, and adversarial testing can maintain benchmark discriminative power as AI systems improve. Working closely with experts in machine learning, scientific reasoning, and evaluation science, you will gain hands‑on experience building tools that support trustworthy and scalable assessment of AI systems used in multi‑agent scientific workflows.
Accountabilities
As an intern, your key responsibilities will include:
- Developing a self-evolving benchmarking framework, incorporating dynamic rubric criteria.
- Designing and implementing evidence-grounded scoring mechanisms, ensuring that model claims and reasoning steps are supported by verifiable traces, tool outputs, or retrieved evidence.
- Investigating robustness and anti-gaming strategies, including adversarial testing to detect behaviours where models optimize the score without improving real-world quality.
- Building lightweight benchmarking tools, following solid software engineering practices to ensure reproducibility, traceability, and modularity.
- Analyzing model behaviour across multiple scientific task families, such as protocol drafting, reasoning chains, and multi-agent planning, to assess the generality of the evolving benchmark.
- Collaborating with scientists to identify key failure modes, high-value assessment signals, and opportunities to integrate the benchmarking framework into scientific workflows.
Essential Skills/Experience
The ideal candidate will possess the following skills and experience:
Essential:
- Currently pursuing a PhD in computer science, machine learning, computational sciences, AI evaluation/robustness, or a related field.
- Strong experience with machine learning and deep learning methods, ideally including evaluation- or alignment-related work.
- Excellent Python programming skills; familiarity with frameworks such as PyTorch, JAX, or TensorFlow.
- Strong analytical mindset with enthusiasm for evaluation science, reliability, and AI governance.
- Ability to work collaboratively in a team environment and communicate scientific ideas effectively.
- Must be at least 18 years of age at time of application.
- Must have UK right-to-work status.
- Must return to schooling at program close (candidates graduating before or during the program are ineligible).
Desirable:
- Experience with benchmarking, evaluation rubrics, reinforcement learning from human/AI feedback, or model auditing.
- Familiarity with agentic AI systems, tool-using models, multi-agent workflows, or long-context reasoning analysis.
- Knowledge of rubric-based scoring, checklists, or structured evaluation frameworks.
- Experience with adversarial testing, generative model safety, or failure mode taxonomy development.
- Interest in applying evaluation science to scientific, biomedical, or protocol generation tasks.
This internship is a valuable opportunity to immerse yourself in cutting‑edge research on AI evaluation and robustness, with access to the necessary computational resources and mentorship from leading experts in the field. If you are ready to transform your technical knowledge into real-world applications, we encourage you to apply and become a part of our team driving innovation at AstraZeneca. Our collaborative environment is designed to help you grow professionally and personally, surrounded by passionate individuals eager to make a difference.
AstraZeneca is where you can immerse yourself in groundbreaking work with real patient impact.
Trusted to work on important projects, you’ll have the independence to take on new challenges while receiving all the guidance you need to succeed.
Our mission is to build an inclusive and equitable environment. We want people to feel they belong at AstraZeneca, starting with the recruitment process. We welcome and consider applications from all qualified candidates, regardless of characteristics.
We offer reasonable adjustments/accommodations to help all candidates to perform at their best. If you have a need for any reasonable adjustments/accommodations, please complete the section in the application form.
Ready to make an impact? Apply now and join us on this exciting journey!
#Earlytalent
Date Posted
30-Jan-2026
Closing Date
13-Feb-2026